non-smooth optimization
An Optimal Structured Zeroth-order Algorithm for Non-smooth Optimization
Finite-difference methods are a class of algorithms designed to solve black-box optimization problems by approximating the gradient of the target function along a set of directions. In black-box optimization, the non-smooth setting is particularly relevant since, in practice, differentiability and smoothness assumptions cannot be verified. To cope with non-smoothness, several authors use a smooth approximation of the target function and show that finite-difference methods approximate its gradient. Recently, it has been proved that imposing structure on the set of directions improves performance. However, only the smooth setting was considered.
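As a concrete illustration of the idea, here is a minimal sketch of a finite-difference gradient surrogate built from structured (orthonormal) directions, assuming central differences and directions drawn via QR factorization of a Gaussian matrix; the name `structured_fd_gradient` and the `d / l` scaling are illustrative choices, not necessarily the paper's exact estimator.

```python
import numpy as np

def structured_fd_gradient(f, x, h=1e-3, num_dirs=None, seed=None):
    """Central finite differences along mutually orthogonal random directions."""
    rng = np.random.default_rng(seed)
    d = x.size
    l = num_dirs or d
    # Structured directions: orthonormal columns from QR of a Gaussian matrix.
    Q, _ = np.linalg.qr(rng.standard_normal((d, l)))
    g = np.zeros(d)
    for i in range(l):
        p = Q[:, i]
        g += (f(x + h * p) - f(x - h * p)) / (2 * h) * p
    return (d / l) * g

# Toy usage on a non-smooth target (the l1 norm) with a decaying step size.
f = lambda x: np.abs(x).sum()
x = np.ones(10)
for t in range(1, 201):
    x = x - 0.5 / np.sqrt(t) * structured_fd_gradient(f, x, num_dirs=5, seed=t)
```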
Distributed stochastic proximal algorithm with random reshuffling for non-smooth finite-sum optimization
Jiang, Xia, Zeng, Xianlin, Sun, Jian, Chen, Jie, Xie, Lihua
Non-smooth finite-sum minimization is a fundamental problem in machine learning. This paper develops a distributed stochastic proximal-gradient algorithm with random reshuffling to solve finite-sum minimization over time-varying multi-agent networks. The objective function is a sum of differentiable convex functions and a non-smooth regularization term. Each agent in the network updates its local variables with a constant step size using local information and cooperates to seek an optimal solution. We prove that the local variable estimates generated by the proposed algorithm achieve consensus and are attracted to a neighborhood of the optimal solution in expectation with an $\mathcal{O}(\frac{1}{T}+\frac{1}{\sqrt{T}})$ convergence rate, where $T$ is the total number of iterations. Finally, comparative simulations are provided to verify the convergence performance of the proposed algorithm.
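A minimal sketch of the update pattern described above, assuming a static doubly-stochastic mixing matrix `W`, an $\ell_1$ regularizer (whose prox is soft-thresholding), and equal local sample counts; the paper's algorithm handles time-varying networks, which this simplified sketch does not.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def distributed_prox_rr(grads, W, d, alpha=0.01, lam=0.1, epochs=50, seed=0):
    """Distributed proximal-gradient sketch with random reshuffling.

    grads[i] is a list of per-sample gradient functions held by agent i;
    W is a doubly-stochastic mixing matrix encoding the (static) network.
    Assumes all agents hold the same number of samples, for simplicity.
    """
    rng = np.random.default_rng(seed)
    n = len(grads)
    X = np.zeros((n, d))                   # one local iterate per agent
    for _ in range(epochs):
        orders = [rng.permutation(len(g)) for g in grads]  # local reshuffle
        for k in range(len(orders[0])):
            X = W @ X                      # consensus: mix with neighbors
            for i in range(n):
                X[i] -= alpha * grads[i][orders[i][k]](X[i])
                X[i] = soft_threshold(X[i], alpha * lam)    # prox step
    return X

# Toy usage: 3 agents, each with 4 per-sample least-squares gradients.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4, 5))
b = rng.standard_normal((3, 4))
grads = [[(lambda a, y: lambda x: (a @ x - y) * a)(A[i, j], b[i, j])
          for j in range(4)] for i in range(3)]
W = np.full((3, 3), 1 / 3)                 # complete graph, uniform weights
X = distributed_prox_rr(grads, W, d=5)
```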
A Richer Theory of Convex Constrained Optimization with Reduced Projections and Improved Rates
Yang, Tianbao, Lin, Qihang, Zhang, Lijun
This paper focuses on convex constrained optimization problems, where the solution is subject to a convex inequality constraint. In particular, we aim at challenging problems for which both projection onto the constrained domain and linear optimization under the inequality constraint are time-consuming, rendering both projected gradient methods and conditional gradient methods (a.k.a. the Frank-Wolfe algorithm) expensive. In this paper, we develop projection-reduced optimization algorithms for both smooth and non-smooth optimization with improved convergence rates under a certain regularity condition of the constraint function. We first present a general theory of optimization with only one projection. Its application to smooth optimization with only one projection yields $O(1/\epsilon)$ iteration complexity, which improves over the $O(1/\epsilon^2)$ iteration complexity established before for non-smooth optimization and can be further reduced under strong convexity. Then we introduce a local error bound condition and develop faster algorithms for non-strongly convex optimization at the price of a logarithmic number of projections. In particular, we achieve an iteration complexity of $\widetilde O(1/\epsilon^{2(1-\theta)})$ for non-smooth optimization and $\widetilde O(1/\epsilon^{1-\theta})$ for smooth optimization, where $\theta\in(0,1]$, appearing in the local error bound condition, characterizes the local growth rate of the function around the optimal solutions. Novel applications to the constrained $\ell_1$ minimization problem and a positive semi-definite constrained distance metric learning problem demonstrate that the proposed algorithms achieve significant speed-ups compared with previous algorithms.
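To make the "only one projection" idea concrete, here is a generic sketch, not the paper's exact scheme: subgradient steps are taken on a penalized objective with no intermediate projections, and a single projection onto the feasible set is applied at the very end. The penalty weight `gamma` and the iterate averaging are illustrative choices.

```python
import numpy as np

def one_projection_subgradient(f_sub, g, g_sub, project, x0, gamma=10.0,
                               step=0.1, T=1000):
    """Illustrative one-projection template.

    Runs subgradient steps on f(x) + gamma * max(0, g(x)) without ever
    projecting, then projects the averaged iterate onto {x : g(x) <= 0}
    exactly once at the end.
    """
    x = x0.copy()
    avg = np.zeros_like(x0)
    for t in range(1, T + 1):
        sub = f_sub(x)
        if g(x) > 0:                       # penalize constraint violation
            sub = sub + gamma * g_sub(x)
        x -= step / np.sqrt(t) * sub
        avg += (x - avg) / t               # running average of iterates
    return project(avg)                    # the single projection

# Toy usage: minimize ||x - c||_1 subject to ||x||_2 <= 1.
c = np.array([2.0, -1.0, 0.5])
sol = one_projection_subgradient(
    f_sub=lambda x: np.sign(x - c),
    g=lambda x: np.linalg.norm(x) - 1.0,
    g_sub=lambda x: x / (np.linalg.norm(x) + 1e-12),
    project=lambda x: x / max(1.0, np.linalg.norm(x)),
    x0=np.zeros(3),
)
```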
Inexact Proximal Gradient Methods for Non-convex and Non-smooth Optimization
Gu, Bin, Huo, Zhouyuan, Huang, Heng
Non-convex and non-smooth optimization plays an important role in machine learning. The proximal gradient method is one of the most important methods for solving non-convex and non-smooth problems, where a proximal operator needs to be solved exactly at each step. However, in many problems the proximal operator has no analytic solution, or an exact solution is expensive to obtain. In this paper, we propose inexact proximal gradient methods (not only a basic inexact proximal gradient method (IPG), but also a Nesterov-accelerated inexact proximal gradient method (AIPG)) for non-convex and non-smooth optimization, which tolerate an error in the calculation of the proximal operator. Theoretical analysis shows that IPG and AIPG have the same convergence rates as in the error-free case, provided that the errors decrease at appropriate rates. Keywords: Non-convex optimization, non-smooth optimization, proximal gradient, inexact proximal operator, Nesterov's accelerated method
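A sketch of the inexact idea under simplifying assumptions: the prox subproblem is solved approximately by a few subgradient steps, with inner accuracy growing over the outer iterations to mimic a decreasing-error condition. The $\ell_1$ regularizer in the toy usage actually has a closed-form prox; it is used here only so the sketch is easy to check.

```python
import numpy as np

def inexact_prox(v, r_sub, lam, inner_iters):
    """Approximate prox of lam * r at v via subgradient steps on the
    subproblem 0.5 * ||z - v||^2 + lam * r(z).  More inner iterations
    means a smaller prox error."""
    z = v.copy()
    for k in range(1, inner_iters + 1):
        z -= (1.0 / k) * ((z - v) + lam * r_sub(z))
    return z

def ipg(grad_f, r_sub, x0, step, lam=0.1, T=200):
    """Basic inexact proximal gradient loop (IPG-style sketch): gradient
    step on the smooth part f, then an inexact prox of the non-smooth
    part r, with inner accuracy increasing over time."""
    x = x0.copy()
    for t in range(1, T + 1):
        v = x - step * grad_f(x)
        x = inexact_prox(v, r_sub, step * lam, inner_iters=t)  # error shrinks
    return x

# Toy usage: f(x) = 0.5 * ||Ax - b||^2 with r = ||.||_1 (subgradient: sign).
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 10)), rng.standard_normal(20)
x_hat = ipg(lambda x: A.T @ (A @ x - b), np.sign, np.zeros(10),
            step=1.0 / np.linalg.norm(A, 2) ** 2)
```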
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Stochastic Gradient Descent (SGD) is one of the simplest and most popular stochastic optimization methods. While it has been theoretically studied for decades, the classical analysis usually requires non-trivial smoothness assumptions, which do not apply to many modern applications of SGD with non-smooth objective functions, such as support vector machines. In this paper, we investigate the performance of SGD without such smoothness assumptions, as well as a running-average scheme to convert the SGD iterates to a solution with optimal optimization accuracy. In this framework, we prove that after $T$ rounds, the suboptimality of the last SGD iterate scales as $O(\log(T)/\sqrt{T})$ for non-smooth convex objective functions, and $O(\log(T)/T)$ in the non-smooth strongly convex case. To the best of our knowledge, these are the first bounds of this kind, and they almost match the minimax-optimal rates obtainable by appropriate averaging schemes. We also propose a new and simple averaging scheme which not only attains optimal rates, but can also be computed easily on-the-fly (in contrast, the suffix averaging scheme proposed in Rakhlin et al. (2011) is not as simple to implement). Finally, we provide some experimental illustrations.
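One on-the-fly averaging rule consistent with this description is polynomial-decay averaging, sketched below; the specific weight w_t = (eta + 1) / (t + eta) and the step-size schedule are assumptions for illustration, not necessarily the paper's exact constants.

```python
import numpy as np

def sgd_poly_averaging(sub_grad, x0, step=0.5, eta=3, T=1000, seed=0):
    """SGD for a non-smooth objective with an on-the-fly weighted average.

    The running average gives later iterates more weight via the rule
    avg <- (1 - w_t) * avg + w_t * x_t with w_t = (eta + 1) / (t + eta),
    and needs only O(1) extra memory -- unlike suffix averaging, which
    must know the suffix start in advance.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    avg = x0.copy()
    for t in range(1, T + 1):
        x -= step / np.sqrt(t) * sub_grad(x, rng)   # SGD step
        w = (eta + 1) / (t + eta)
        avg = (1 - w) * avg + w * x                 # on-the-fly average
    return avg

# Toy usage: noisy subgradients of the non-smooth f(x) = ||x - c||_1.
c = np.array([1.0, -2.0, 0.0])
x_bar = sgd_poly_averaging(
    lambda x, rng: np.sign(x - c) + 0.1 * rng.standard_normal(x.size),
    x0=np.zeros(3))
```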